Computational Biology Lecture 2: Some problems in biology

Author

  • Saad Mneimneh
Abstract

Specific genes, or specific gene mutations, are frequently the cause of diseases. To help identify when a disease is associated with the presence of a particular gene, it is useful to have a map of the genome. Genetic mapping is the process of determining the relative location of genes in a particular genome. By mapping the relative location of several genes, we can develop a map of the entire genome of a particular species, e.g. humans. Genetic mapping relies on externally observable genetic traits, or phenotypes, of a particular genome. By selectively breeding specific phenotypes together and examining the phenotypes of several generations of children, we can determine how frequently particular phenotypes occur. After a large number of observations, we have a reasonable approximation of the probability of the occurrence of those phenotypes. Based on these probabilities, we can estimate the distance between specific genes.

As an example, consider a genome that has $n$ genes on a single chromosome, and assume that recombination occurs at exactly one point on the chromosome, chosen uniformly at random. If the father's chromosome is $f_1 \ldots f_n$ and the mother's chromosome is $m_1 \ldots m_n$, a child can have either an $f_1 \ldots f_i m_{i+1} \ldots m_n$ or an $m_1 \ldots m_i f_{i+1} \ldots f_n$ chromosome, for some recombination position $0 \le i \le n$. As a result, every pair of parents can have $2(n+1)$ possible kinds of children (some might be identical) under this single-recombination-position model. The probability of the recombination occurring at a particular position is $p = \frac{1}{n+1}$. The probability of two genes being separated by recombination is the probability that the recombination occurs at any position between them. This probability is $p = \frac{d}{n+1}$, where $n$ is the number of genes in the genome and $d$ is the distance between the two genes, measured as the number of recombination positions that lie between them.
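
The single-recombination model above is small enough to check by brute force. The following is a minimal sketch, not part of the original notes: it enumerates all $2(n+1)$ children and counts how often two genes come from different parents. The function names `children` and `separation_probability`, and the convention that genes are numbered from 1 with $d = b - a$, are assumptions made for illustration.

```python
from fractions import Fraction


def children(father, mother):
    """Enumerate all 2(n+1) children under the single-recombination model."""
    n = len(father)
    kids = []
    for i in range(n + 1):  # recombination position 0 <= i <= n
        kids.append(father[:i] + mother[i:])  # f_1..f_i m_{i+1}..m_n
        kids.append(mother[:i] + father[i:])  # m_1..m_i f_{i+1}..f_n
    return kids


def separation_probability(n, a, b):
    """Exact probability that genes a < b (1-based) come from different parents."""
    father = ["f"] * n  # label every paternal gene "f"
    mother = ["m"] * n  # label every maternal gene "m"
    kids = children(father, mother)
    separated = sum(1 for kid in kids if kid[a - 1] != kid[b - 1])
    return Fraction(separated, len(kids))


if __name__ == "__main__":
    n, a, b = 9, 3, 7
    # Under the model this should equal d / (n + 1) with d = b - a.
    print(separation_probability(n, a, b), Fraction(b - a, n + 1))
```

Because every recombination position is equally likely in this model, counting over the enumerated children gives the exact probability rather than an estimate.
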
Note that closer genes have a lower chance of being separated by recombination (this is where Mendel's second law, the law of independent assortment, fails: genes on the same chromosome are not inherited independently). By starting with two different purebred parents, say black and blue, we can concentrate on the states of two genes and observe how frequently those states differ in the children (i.e. the frequency of recombination between the two genes). This lets us estimate $p$ and therefore $d$. If we can determine the distance between all pairs of genes in our example genome, we can use these distances to determine the order of the genes along the chromosome. However, real life is not as simple as our example. In general, recombination occurs at an arbitrary number of positions in the genome. There are not always distinct phenotypes for a particular gene or gene combination. This means that single changes cannot always be observed; in fact, it usually takes multiple changes before the phenotype changes at all. Moreover, some differences in the genotype are not reflected in any phenotype. It is also highly probable that separate genes are far apart and randomly distributed among multiple chromosomes, which makes the job even more difficult. On the other hand, if two genes are particularly close, their order cannot be determined, because recombination almost never separates them.
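
For the idealized example only (one chromosome, one recombination point, exact distances), the two estimation steps described above can be sketched as follows. This is an illustrative assumption, not the method of the notes: `estimate_distance` inverts $p = \frac{d}{n+1}$, and `order_from_distances` uses the fact that the two genes farthest apart must sit at the two ends, then sorts the rest by distance from one end. The dictionary layout keyed by `frozenset` pairs is a hypothetical choice, and the recovered order is only determined up to reversal.

```python
def estimate_distance(recombinants, total, n):
    """Estimate d from an observed recombinant count, using p = d / (n + 1)."""
    p_hat = recombinants / total  # observed recombination frequency
    return p_hat * (n + 1)


def order_from_distances(dist):
    """Recover a linear gene order (up to reversal) from pairwise distances.

    dist maps frozenset({g, h}) -> distance between genes g and h.
    Assumes the distances are exact and consistent with a single line.
    """
    genes = {g for pair in dist for g in pair}
    # The farthest-apart pair must be the two end genes.
    end_pair = max(dist, key=dist.get)
    start = min(end_pair)  # pick one end deterministically
    return sorted(
        genes,
        key=lambda g: 0 if g == start else dist[frozenset((start, g))],
    )


if __name__ == "__main__":
    # Hypothetical gene positions, used only to build consistent distances.
    pos = {"x": 5, "a": 2, "m": 9, "b": 1}
    names = list(pos)
    dist = {
        frozenset((g, h)): abs(pos[g] - pos[h])
        for k, g in enumerate(names)
        for h in names[k + 1:]
    }
    print(order_from_distances(dist))
    print(estimate_distance(40, 100, 9))
```

With noisy estimates from real crosses, the farthest pair and the sorted order can both be wrong, which is one reason the simple approach does not carry over directly to real genomes.
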




Publication date: 2004